Massive parallel In-Memory Database with GPU-based Query Co-Processor
Abstract
This talk presents work on transforming SQL-IMDB, a commercially available in-memory database system, into a massively parallel, array-structured data processor that extends the “classic” query engine architecture with GPU-based co-processing facilities. The chosen approach is not a simple re-implementation of common database functionality such as sorting, stream processing and joins on GPUs; instead, we take a holistic view and extend the entire query engine to work as a genuine in-memory, GPU-supported database engine. We have partitioned the query engine so that the CPU and the GPU each do what they are best at. The new SQL-IMDBg query execution engine is a “Split-Work” engine that optimizes, schedules and executes the query plan simultaneously, and as efficiently as possible, on two (or more) different memory devices. The engine's principal architecture, which is based on managing multiple memory devices simultaneously (local, shared and flash memory), was a natural fit for including GPU/video memory as just another (high-speed) memory device. All internal core engine data structures are now based on simple array structures, for maximum parallel access support on multi- and many-core hardware. Data tables located in GPU video memory can always be queried together with CPU local- and shared-memory tables in “mixed” query statements. Columns of GPU tables are also accessible through GPU-based indexes. A special index structure based on sorted containers was developed to support both CPU- and GPU-based index lookups. Table data can be split between CPU and GPU manually or automatically and is held in vertically partitioned columns, which eases stream-like processing for basic scan primitives and coalesced memory access on GPU devices.
Based on the experience gained, we see GPU/video memory as another important high-speed memory device for in-memory database systems, but one that does not yet fit well into the architecture of current database engines and therefore requires a major effort to re-engineer the entire core database architecture.
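The abstract mentions vertically partitioned columns and an index built on sorted containers. The following is a minimal sketch of that general idea in plain Python, not SQL-IMDBg code: the class `SortedColumnIndex`, its methods and the sample table are all hypothetical names chosen for illustration. The point it tries to show is that a sorted flat array plus a parallel array of row ids can answer point and range lookups by binary search, using only array accesses, which is the kind of structure that could be searched on either a CPU or a GPU.

```python
from bisect import bisect_left, bisect_right


class SortedColumnIndex:
    """Hypothetical array-based index: sorted keys plus a parallel array of row ids."""

    def __init__(self, column):
        # Sort row positions by key value; sorted() is stable, so rows with
        # equal keys keep their original relative order.
        order = sorted(range(len(column)), key=column.__getitem__)
        self.keys = [column[i] for i in order]
        self.row_ids = order

    def lookup(self, key):
        """Return the row ids whose column value equals key."""
        lo = bisect_left(self.keys, key)
        hi = bisect_right(self.keys, key)
        return self.row_ids[lo:hi]

    def range_lookup(self, low, high):
        """Return the row ids whose column value lies in [low, high]."""
        lo = bisect_left(self.keys, low)
        hi = bisect_right(self.keys, high)
        return self.row_ids[lo:hi]


# Vertically partitioned table: one flat array per column (illustrative data).
table = {
    "id": [10, 11, 12, 13, 14],
    "price": [5, 3, 5, 9, 1],
}

index = SortedColumnIndex(table["price"])
matching_rows = index.lookup(5)
matching_ids = [table["id"][row] for row in matching_rows]
```

Because each column is a separate contiguous array, a scan or lookup touches only the columns a query needs. On a GPU, neighbouring threads reading neighbouring elements of one column is the access pattern that coalesced memory access relies on.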
Similar resources
GI-Workshop Grundlagen von Datenbanken, 31.05.2011 - 03.06.2011, Obergurgl, Tirol, Österreich
A First Step Towards GPU-assisted Query Optimization
Modern graphics cards bundle high-bandwidth memory with a massively parallel processor, making them an interesting platform for running data-intensive operations. Consequently, several authors have discussed accelerating database operators using graphics cards, often demonstrating promising speed-ups. However, due to limitations stemming from limited device memory and expensive data transfer, G...
Relational Query Co-Processing on Graphics Processors
Graphics processors (GPUs) have recently emerged as a powerful co-processor for general-purpose computation. Compared with commodity CPUs, GPUs have an order of magnitude higher computation power as well as memory bandwidth. Moreover, new-generation GPUs allow writes to random memory locations, provide efficient inter-processor communication through on-chip local memory, and support a general-p...
Unleashing the Hidden Power of Integrated-GPUs for Database Co-Processing
Modern high-performance server systems with their wide variety of compute resources (i.e. multi-core CPUs, GPUs, FPGAs, etc.) bring vast computing power to the fingertips of researchers and developers. To investigate modern CPU+GPU co-processing architectures and to discover their relative marginal return on resources (“bang for the buck”), we compare the different architectures with main focus...
Manycore processing of repeated range queries over massive moving objects observations
The ability to timely process significant amounts of continuously updated spatial data is mandatory for an increasing number of applications. Parallelism enables such applications to face this data-intensive challenge and allows the devised systems to feature low latency and high scalability. In this paper we focus on a specific data-intensive problem, concerning the repeated processing of huge...
Publication date: 2011